#864 development constant tail bug #868

Open
kennethshsu wants to merge 27 commits into main from #864_DevelopmentConstant_Tail_Bug

@kennethshsu
Collaborator

@kennethshsu kennethshsu commented May 27, 2026

Summary of Changes

Addressed two bugs in DevelopmentConstant():

  1. When the supplied pattern is longer than the triangle, everything past the end of the triangle (the tail) is incorrectly discarded.
  2. When the supplied pattern is an LDF (instead of a CDF), the algorithm fails to convert it to a CDF first.
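
For reviewers less familiar with the two pattern forms, the conversion that bug 2 refers to can be sketched as a reverse cumulative product. This is a minimal illustration only: `ldf_to_cdf` is a hypothetical helper, not the code in this PR.

```python
from itertools import accumulate
from operator import mul


def ldf_to_cdf(ldfs):
    """Convert age-to-age factors (LDFs) to age-to-ultimate factors (CDFs).

    Hypothetical helper for illustration only. The CDF at an age is the
    product of the LDF at that age and every later LDF.
    """
    # Accumulate products from the oldest age backwards, then restore order.
    return list(accumulate(reversed(ldfs), mul))[::-1]


ldfs = [2.0, 1.5, 1.1]   # e.g. 12-24, 24-36, 36-ult
cdfs = ldf_to_cdf(ldfs)  # 12-ult, 24-ult, 36-ult
print(cdfs)
```

The last CDF always equals the last LDF, which is why a pattern of constant 1.1 LDFs and a pattern of 1.1**k CDFs describe the same development.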

Related GitHub Issue(s)

Fixes #864

Additional Context for Reviewers

This PR fixes both bugs, even though only one is reported on #864.
There also appears to be an old bug in test_constant_callable_axis1: the test compared against patterns.values[:, :-1], which dropped the last value, and I am not sure why. This is corrected.

  • I passed tests locally for both code (uv run pytest) and documentation changes (uv run jb build docs --builder=custom --custom-builder=doctest)

Note

Medium Risk
Changes core actuarial development-factor logic used in reserving workflows; behavior shifts for long LDF patterns and tails, though covered by extensive new tests.

Overview
DevelopmentConstant.fit is reworked so external CDF/LDF patterns align with triangle length and tails are preserved instead of dropped when patterns extend past the triangle.

A new _prepare_cdf_patterns converts LDF inputs to CDFs, rebases in-triangle CDFs when a tail exists, and returns a tail_cdf applied to the last LDF. fit now warns and fills missing ages with 1.0 when patterns are shorter than the triangle, handles callable patterns per row with the same tail logic, and supports incremental triangles via not X.is_cumulative.
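
A rough sketch of the tail split described above, assuming the tail is the CDF at the first age past the triangle. `split_pattern` and its arguments are hypothetical names for illustration and are not the PR's `_prepare_cdf_patterns`.

```python
def split_pattern(cdf_by_age, n_dev):
    """Split a CDF pattern into in-triangle CDFs and a tail CDF.

    Hypothetical sketch. `cdf_by_age` maps development age to an
    age-to-ultimate factor; `n_dev` is the number of development
    periods in the triangle.
    """
    ages = sorted(cdf_by_age)
    if len(ages) <= n_dev:
        # Nothing extends past the triangle, so there is no tail.
        return dict(cdf_by_age), 1.0
    # Assumption: the CDF at the first age beyond the triangle is the tail.
    tail_cdf = cdf_by_age[ages[n_dev]]
    # Rebase the in-triangle CDFs so they end at the triangle's last age.
    in_triangle = {age: cdf_by_age[age] / tail_cdf for age in ages[:n_dev]}
    return in_triangle, tail_cdf


# Ten CDFs (ages 12..120) built from a constant 1.1 LDF, fit to nine periods.
cdf = {12 * (i + 1): 1.1 ** (10 - i) for i in range(10)}
in_triangle, tail_cdf = split_pattern(cdf, n_dev=9)
print(in_triangle[108], tail_cdf)
```

Multiplying the rebased factor at 108 by `tail_cdf` recovers the supplied age-to-ultimate factor at 108, which is the quantity the PR applies to the last LDF.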

test_constant.py gains broad regression coverage (exact/short/long patterns, tail vs no tail, LDF vs CDF, incremental) and fixes test_constant_callable_axis1 to compare full CDF values without incorrectly dropping the last period.

Reviewed by Cursor Bugbot for commit 6f28c21. Bugbot is set up for automated code reviews on this repo. Configure here.

@codecov

codecov Bot commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.19%. Comparing base (449b5c1) to head (6f28c21).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #868      +/-   ##
==========================================
+ Coverage   87.04%   87.19%   +0.15%     
==========================================
  Files          86       86              
  Lines        4986     5030      +44     
  Branches      646      655       +9     
==========================================
+ Hits         4340     4386      +46     
+ Misses        456      455       -1     
+ Partials      190      189       -1     
Flag Coverage Δ
unittests 87.19% <100.00%> (+0.15%) ⬆️

Comment thread chainladder/development/constant.py Outdated
@kennethshsu kennethshsu marked this pull request as ready for review May 27, 2026 23:30
@kennethshsu kennethshsu requested a review from jbogaardt as a code owner May 27, 2026 23:30
Comment thread chainladder/development/constant.py Outdated
Comment on lines +113 to +123
if len(cdf_patterns) < tri_dev_periods:
    warnings.warn(
        "Supplied patterns are shorter than the triangle development "
        "periods. Missing ages will be filled with a factor of 1.0.",
        UserWarning,
        stacklevel=2,
    )
    tail_cdf = 1

elif len(cdf_patterns) == tri_dev_periods:
    tail_cdf = 1
Collaborator Author

This warning is new. I thought about not giving one and just doing it quietly, but I think this is better, since I also throw a warning when I add assumptions on the user's behalf in the cl.Development() method.
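
To make the warn-and-fill behaviour concrete, here is a minimal standalone sketch. `fill_short_pattern` and its argument names are assumptions for illustration; only the warning text is taken from the diff above.

```python
import warnings


def fill_short_pattern(patterns, triangle_ages):
    """Fill ages missing from `patterns` with 1.0, warning if any are missing.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    missing = [age for age in triangle_ages if age not in patterns]
    if missing:
        warnings.warn(
            "Supplied patterns are shorter than the triangle development "
            "periods. Missing ages will be filled with a factor of 1.0.",
            UserWarning,
            stacklevel=2,
        )
    return {**patterns, **{age: 1.0 for age in missing}}


# Capture the warning so the behaviour can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    filled = fill_short_pattern({12: 1.5, 24: 1.2}, [12, 24, 36, 48])

print(filled, len(caught))
```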

Collaborator Author

On second look I think I can discard the tail_cdf = 1. That's already set above.

Comment thread chainladder/development/constant.py Outdated
Comment on lines +127 to +130
if pattern_dev_periods < tri_dev_periods:
    include_last = False
elif pattern_dev_periods == tri_dev_periods:
    include_last = True
Collaborator Author

This is to address problem 2 in the main thread. Usually, the last age-to-age factor is not included: it assumes that the oldest origin row is at ultimate, e.g. if it's at age 120, the last age-to-ultimate factor would be 108-ult, NOT 120-ult. Rewriting this broke a bunch of other things, so I kept that here.

If we need to somehow include 120-ult, because they are the same length, we'll do that here.

else:
    include_last = tail_cdf != 1

dev_slice = slice(None) if include_last else slice(None, -1)
Collaborator Author

This slices the development axis, depending on whether we drop the last factor (e.g. 120-ult).
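
The effect of the two slice objects can be seen on a plain list of development lags. This is a toy illustration with made-up lags, not the triangle object itself.

```python
# Development lags for a toy triangle with four development periods.
ddims = [12, 24, 36, 48]

for include_last in (True, False):
    # slice(None) keeps every lag; slice(None, -1) drops the oldest one.
    dev_slice = slice(None) if include_last else slice(None, -1)
    print(include_last, ddims[dev_slice])

kept_all = ddims[slice(None)]
dropped_last = ddims[slice(None, -1)]
```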

dev_slice = slice(None) if include_last else slice(None, -1)

# this is the object to fill out the patterns, skeleton frame
obj = obj.iloc[..., :1, dev_slice] * 0 + 1
Collaborator Author

Now, we are back to the old method, obj is just the blank age to age (NOT age to ult) to fill out.

Comment on lines +150 to +165
ldf = (
    pd.concat(
        [pd.DataFrame(item[0], index=[0]) for item in prepared],
        axis=0,
    )
    .fillna(1)[obj.ddims]
    .values
)
tail_cdfs = xp.array([item[1] for item in prepared])

if self.callable_axis == 0:
    ldf = xp.array(ldf[:, None, None, :])
    tail_cdfs = tail_cdfs[:, None, None]
else:
    ldf = xp.array(ldf[None, :, None, :])
    tail_cdfs = tail_cdfs[None, :, None]
Collaborator Author

This is similar to the old code; I didn't touch it.
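
For anyone reading the indexing above cold, the `None` entries insert length-one axes so a per-row pattern lines up with the four triangle axes (index, columns, origin, development). A small NumPy sketch with made-up values:

```python
import numpy as np

# Three rows of patterns, each with four development factors (toy values).
ldf = np.ones((3, 4))
tail_cdfs = np.array([1.0, 1.1, 1.2])

# callable_axis == 0: rows vary along the first (index) axis.
ldf_axis0 = ldf[:, None, None, :]
tail_axis0 = tail_cdfs[:, None, None]

# callable_axis == 1: rows vary along the second (columns) axis.
ldf_axis1 = ldf[None, :, None, :]
tail_axis1 = tail_cdfs[None, :, None]

print(ldf_axis0.shape, tail_axis0.shape)  # (3, 1, 1, 4) (3, 1, 1)
print(ldf_axis1.shape, tail_axis1.shape)  # (1, 3, 1, 4) (1, 3, 1)
```

In both layouts the tail array has the same shape as `ldf[..., -1]`, so it can be multiplied into the last development factor directly.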

Comment on lines +141 to +147
def _callable_row(row):
    raw_patterns = self.patterns(row)
    cdf_row, row_tail_cdf = self._prepare_cdf_patterns(
        raw_patterns, tri_dev_periods
    )
    fit_row = raw_patterns if self.style == "ldf" else cdf_row
    return dict(fit_row), row_tail_cdf
Collaborator Author

The row handling comes from the code above, for when patterns is callable. This is super annoying because the pattern needs to stay in whichever form it was supplied (LDF or CDF); we just need to get the tail out.

Comment thread chainladder/development/constant.py Outdated
Comment on lines +169 to +174
if not any(ddim == k or int(ddim) == int(k) for k in cdf_patterns):
    cdf_patterns[int(ddim)] = 1.0
if self.style == "ldf" and not any(
    ddim == k or int(ddim) == int(k) for k in patterns
):
    patterns[int(ddim)] = 1.0
Collaborator Author

Fill in the cdf patterns with 1.0 if something is missing (i.e. blank spaces).

Comment thread chainladder/development/constant.py Outdated
):
    patterns[int(ddim)] = 1.0

fit_patterns = patterns if self.style == "ldf" else cdf_patterns
Collaborator Author

Still keeping track of both LDFs and CDFs.

Comment on lines +177 to 178
ldf = xp.array([float(fit_patterns[int(item)]) for item in obj.ddims])
ldf = ldf[None, None, None, :]
Collaborator Author

Same as before.

ldf = xp.concatenate((ldf[..., :-1] / ldf[..., 1:], ldf[..., -1:]), -1)

# apply tail_cdf to the last ldfs of the triangle
ldf[..., -1] = ldf[..., -1] * tail_cdfs
Collaborator Author

This comes after everything else has been worked through; all that's left is to multiply the last element by the tail.
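
The two lines above can be tried in isolation with NumPy. The values below are made up, and `tail_cdfs` is a scalar here for simplicity, whereas the PR uses an array.

```python
import numpy as np

# In-triangle CDFs for three ages, already rebased to end at the last age.
cdf = np.array([1.1**3, 1.1**2, 1.1])[None, None, None, :]
tail_cdfs = 1.1  # hypothetical age-to-ultimate factor beyond the triangle

# The ratio of adjacent CDFs gives each LDF; the last CDF is its own LDF.
ldf = np.concatenate((cdf[..., :-1] / cdf[..., 1:], cdf[..., -1:]), -1)

# Fold the tail into the final LDF.
ldf[..., -1] = ldf[..., -1] * tail_cdfs

print(ldf.ravel())
```

With these inputs the first two LDFs come out at about 1.1 and the last at about 1.21, i.e. the final in-triangle factor times the tail.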

@kennethshsu
Collaborator Author

Hopefully this helps a bit. I can try to improve the readability to reduce tech debt a bit more tomorrow.

@kennethshsu
Collaborator Author

@henrydingliu do you have Claude code?

@henrydingliu
Collaborator

> Hopefully this helps a bit. I can try to improve the readability to reduce tech debt a bit more tomorrow.

it's a lot of catching of corner cases. i actually just had an idea after reviewing your other PR. you know how @genedan created a fake 2nd latest diagonal in all the friedland data? what if we just take that idea to implement DevelopmentConstant()? then we'd be leveraging all the corner case catching that's built into Development() itself. basically, when calling DevelopmentConstant().fit(), you just create a triangle with all 1s, then override the latest diagonal with the pattern (and override the second latest diagonal with the pattern if cdf), and run a Development(n_periods=1) to get the ldf_.
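
The all-ones triangle idea can be mimicked in plain Python for the LDF case. This is only a sketch of the arithmetic with nested lists and a hypothetical function name; it does not call chainladder, and the CDF variant (overriding the second latest diagonal as well) is not covered.

```python
def ldf_from_latest_diagonal(ldfs):
    """Recover LDFs from an all-ones triangle whose latest diagonal is set.

    Plain-Python mimic of the idea, not chainladder code. With n LDFs the
    triangle has n + 1 development ages and origin i has n + 1 - i cells.
    """
    n = len(ldfs)
    # Cumulative triangle of ones; tri[i][j] is origin i at development j.
    tri = [[1.0] * (n + 1 - i) for i in range(n + 1)]
    # Override the latest diagonal so each last link ratio equals the LDF.
    for j, factor in enumerate(ldfs):
        tri[n - 1 - j][j + 1] = factor
    # A one-period average uses only the link ratio ending on that diagonal.
    return [tri[n - 1 - j][j + 1] / tri[n - 1 - j][j] for j in range(n)]


recovered = ldf_from_latest_diagonal([2.0, 1.5, 1.1])
print(recovered)
```

Because every cell on the second latest diagonal stays at 1, each link ratio is just the supplied factor, which is the property the suggestion relies on.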

> do you have Claude code?

i develop in databricks. so i just use the built-in AI for basic syntax type acceleration/debugging. this is what the databricks AI tells me when I ask what model it is.

> I’m an OpenAI-provided assistant accessed through Databricks, but the specific model identifier may be abstracted away by the platform. If you need the exact deployment/model name, the best place to check is your Databricks or API configuration where this assistant was set up.

I also have a paid copilot 365. i do a lot of ideation and high-level solution design through that. Claude Opus is one of the models available.

@kennethshsu
Collaborator Author

Hmm ok, I tried a few times to ask Cursor to refactor my code and it worked pretty terribly. I am going to go through another pass to do more cleanup but feel free to give it a try!

).fit_transform(raa)
assert np.all(
    np.round(result.cdf_.to_frame().values.flatten(), 6)
    == np.array([1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1])
Collaborator

isn't this the same problem as what you originally wanted to address? i.e. fed 11 cdf into developmentconstant but only get 10 back

Collaborator

ah, okay. the original code only gave you 9 back

Collaborator

@henrydingliu henrydingliu May 29, 2026

on third thought, i think this test actually shows a bug. the ldf_ would show 120-132 to be 1.1, which is not what was originally supplied.

Collaborator Author

Sorry, you think the RHS is wrong? Why? The cdf from 120-ult is 1.1, as supplied?

Collaborator

i'm not saying that the rhs is wrong. i'm saying this test isn't capturing an error in the untested ldf_. the ldf_ at 120 coming out of DevelopmentConstant is 1.1, but the actual ldf at 120 that was supplied to the estimator was 1.

obj = self._set_fit_groups(X).val_to_dev().copy()

xp = obj.get_array_module()
obj = obj.iloc[..., :1, :-1]*0+1
Collaborator

i think this is the only line you have to change.

if self.style == "cdf":
    obj = obj.iloc[..., :1, :]*0+1
else:
    obj = obj.iloc[..., :1, :-1]*0+1

Collaborator Author

Nah, that's the lazy way. You aren't going to catch all the edge cases. Just bring my tests in and try your code. You'll fail a bunch.

Collaborator Author

obj = obj.iloc[..., :1, :]*0+1 fundamentally changes the structure: it now says every development period will develop one more period, including the oldest origin period.

And if it's LDF style, then you don't? This is wrong.

Collaborator Author

Try to bring my tests in, and you can see if you can catch all the edge cases more cleanly. I'm sure there's a way.

I think for this PR, reviewing the tests is actually more important than the code itself.

Collaborator

> Nah, that's the lazy way. You aren't going to catch all the edge cases. Just bring my tests in and try your code. You'll fail a bunch.

it definitely is the lazy way. it's also done in this PR, literally just a few lines down

obj = obj.iloc[..., :1, dev_slice] * 0 + 1

> Try to bring my tests in, and you can see if you can catch all the edge cases more cleanly

good idea. i can do that

Collaborator Author

Oh and if you want to try to use the tests and let AI solve it, you can give that a try if you have access to a good AI agent.

Cursor couldn't do it and just kept iterating itself until I killed it.

Collaborator

I implemented an alternative approach in this branch. you can find the diff here. I couldn't figure out how to pass all of your tests. So instead, I'm choosing to declare two of your tests defective :P

Collaborator Author

Ha, it's very possible I made mistakes on the tests. So let's make sure the tests are right.

Your implementation is much cleaner, let's just make sure it can catch all the edge cases. I think we are super close.

Collaborator

i think each test needs to test both the ldf_ and the cdf_. we'd just have to manually calculate the cdf/ldf from the supplied vector.

1.4641,
1.331,
1.21,
]
Collaborator

@henrydingliu henrydingliu May 29, 2026

i think this test actually shows a bug. the ldf_ would show 120-132 to be 1.21, which is not what was originally supplied.

Collaborator Author

The CDF, not LDF, is 1.21. Line 365 checks the CDF.

Collaborator

yes, the cdf is 1.21. that is correct. but the underlying ldf_ at 120 is also 1.21, which is not what was supplied to the estimator

Collaborator Author

Let me check... I will add tests to check both the cdf_ and ldf_

@henrydingliu henrydingliu mentioned this pull request May 29, 2026
1 task
@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

else:
    raise ValueError("callable axis needs to be 0 or 1")

patterns = self.patterns(rows.iloc[0])

Callable path determines shape from first row only

Low Severity

When patterns is callable, self.patterns(rows.iloc[0]) is called to determine include_last and dev_slice based on the first row's pattern length. Each subsequent row is then independently processed in _callable_row, which may produce a different row_tail_cdf. If different rows return patterns of different lengths, the obj skeleton shape (determined solely by the first row) may be inappropriate for other rows — for example, if the first row's pattern is short (include_last=False) but another row's is long, the obj.ddims will have one fewer period than that row needs.
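
One possible guard for the situation described above, sketched with hypothetical names (`check_row_lengths`, `pattern_fn`, `rows`). It is an illustration of the concern, not a proposed patch, and it assumes that rejecting mixed lengths is the desired behaviour.

```python
def check_row_lengths(pattern_fn, rows):
    """Raise if a callable returns patterns of differing lengths.

    Hypothetical guard; `pattern_fn` and `rows` stand in for the callable
    `patterns` argument and the rows it is applied to.
    """
    lengths = {len(pattern_fn(row)) for row in rows}
    if len(lengths) > 1:
        raise ValueError(
            f"callable patterns returned differing lengths: {sorted(lengths)}"
        )
    return lengths.pop() if lengths else 0


# Toy rows whose patterns have one and two factors respectively.
patterns_by_row = {"short": {12: 1.2}, "long": {12: 1.5, 24: 1.2}}
try:
    check_row_lengths(patterns_by_row.get, ["short", "long"])
except ValueError as err:
    message = str(err)

print(message)
```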

)
assert np.all(
    np.round(result.ldf_.to_frame().values.flatten(), 6)
    == np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.1])
Collaborator

this doesn't match "reported_pattern"

Collaborator Author

It's not supposed to: reported_pattern is in CDF form, but the LHS/RHS check here is in LDF form.

Collaborator Author

Let me know if you agree lol.

I had to be super careful, made many mistakes before. Just double check to see if you are aligned.

Collaborator

@henrydingliu henrydingliu May 29, 2026

the 1.1 comes at the 10th in the returned ldf_, instead of the 11th element implied by the original pattern

Collaborator Author

Because the data object doesn't have the 11th origin period?

Collaborator

right. so we have an issue here because the returned ldf_ changes, beyond filling with additional 1.0's, depending on how large the triangle is. in theory, we would want DevelopmentConstant().fit(tri_9x9).ldf_[:-1] to be equal to DevelopmentConstant().fit(tri_8x8).ldf_ but that's not happening
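
The prefix property being asked for can be checked on plain lists. `fit_ldfs` below is a hypothetical stand-in for the two behaviours under discussion (fold the surplus factors into the last link, or drop them); it is not either implementation, and it assumes at least one link is kept.

```python
from math import prod


def fit_ldfs(ldfs, n_links, fold_tail):
    """Fit a supplied LDF list to `n_links` age-to-age links (n_links >= 1).

    Hypothetical stand-in: either fold every surplus factor into the
    last link, or drop the surplus entirely.
    """
    kept = list(ldfs[:n_links])
    if fold_tail and len(ldfs) > n_links:
        kept[-1] *= prod(ldfs[n_links:])
    return kept


supplied = [1.1] * 10
for fold_tail in (True, False):
    larger = fit_ldfs(supplied, 9, fold_tail)
    smaller = fit_ldfs(supplied, 8, fold_tail)
    # Prefix property: the larger fit, minus its last link, equals the smaller.
    print(fold_tail, larger[:-1] == smaller)
```

Under these assumptions the prefix check fails when the tail is folded and holds when it is dropped, which is the trade-off the thread is circling.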

Collaborator Author

Let me reply to you on the main thread.

1.1,
1.1,
1.1,
1.1**2,
Collaborator

this doesn't match reported_patterns

Collaborator Author

It does? The only thing that's different is the last LDF, those need to be grouped, or it will be incorrectly discarded.

Collaborator

there's no 1.21 in the original ldf pattern.

Collaborator Author

Yes because the pattern extends beyond what is needed by the data object. Are you suggesting that you just discard the last one?

Collaborator

no, the PR is what's discarding the last ldf. i'm saying we need to keep the last ldf, that would make it consistent with Development

import chainladder as cl

raa = cl.load_sample('raa')
dev = cl.Development().fit(raa)
raa_1987 = raa[(raa.valuation <= '1988-01-01')]
print(dev.transform(raa).ldf_)
print(dev.transform(raa_1987).ldf_)
          12-24     24-36     36-48     48-60     60-72     72-84     84-96    96-108   108-120
(All)  2.999359  1.623523  1.270888  1.171675  1.113385  1.041935  1.033264  1.016936  1.009217
          12-24     24-36     36-48     48-60     60-72     72-84     84-96    96-108   108-120
(All)  2.999359  1.623523  1.270888  1.171675  1.113385  1.041935  1.033264  1.016936  1.009217

@kennethshsu
Collaborator Author

> right. so we have an issue here because the returned ldf_ changes, beyond filling with additional 1.0's, depending on how large the triangle is. in theory, we would want DevelopmentConstant().fit(tri_9x9).ldf_[:-1] to be equal to DevelopmentConstant().fit(tri_8x8).ldf_ but that's not happening

So I think the above is only true if the pattern supplied is shorter than a length of 7. Try this example, and comment/uncomment the pattern length, everything is working as expected to me.

When pattern is ldf style:

import chainladder as cl

raa10x10 = cl.load_sample("raa")
raa9x9 = raa10x10[raa10x10.valuation<'1990']

reported_patterns = {
    12: 1.1,
    24: 1.1,
    36: 1.1,
    48: 1.1,
    60: 1.1,
    72: 1.1,
    84: 1.1,
    96: 1.1,
    108: 1.1,
    # 120: 1.1,
    # 132: 1.1,
}

raa10x10_ldf_ = cl.DevelopmentConstant(
    patterns=reported_patterns, style="ldf"
).fit_transform(raa10x10).ldf_
raa9x9_ldf_ = cl.DevelopmentConstant(
    patterns=reported_patterns, style="ldf"
).fit_transform(raa9x9).ldf_
print(raa10x10_ldf_)
print(raa9x9_ldf_)

In cdf style:

raa10x10 = cl.load_sample("raa")
raa9x9 = raa10x10[raa10x10.valuation<'1990']

reported_patterns = {
    12: 1.1,
    24: 1.1,
    36: 1.1,
    48: 1.1,
    60: 1.1,
    72: 1.1,
    84: 1.1,
    96: 1.1,
    108: 1.1,
    # 120: 1.1,
    # 132: 1.1,
}

raa10x10_ldf_ = cl.DevelopmentConstant(
    patterns=reported_patterns, style="cdf"
).fit_transform(raa10x10).ldf_
raa9x9_ldf_ = cl.DevelopmentConstant(
    patterns=reported_patterns, style="cdf"
).fit_transform(raa9x9).ldf_
print(raa10x10_ldf_)
print(raa9x9_ldf_)

@kennethshsu
Collaborator Author

Sorry, I am not following your question/feedback. Can you give me an example of where you think my implementation doesn't give the correct answer, and what the correct answer should be?

@henrydingliu
Collaborator

right. when the supplied ldf or cdf is longer, the resulting ldf_ is distorted. right now you are forcing those tests to pass by comparing the resulting ldf_ to something other than what was supplied.

@henrydingliu
Collaborator

> Sorry, I am not following your question/feedback. Can you give me an example of where you think my implementation doesn't give the correct answer, and what the correct answer should be?

this implementation doesn't give the right answer when the supplied pattern is longer. the correct answer should replicate the supplied ldf or the implied ldf of the supplied cdf exactly (not counting factors of 1 to fill the space).

basically, if elsewhere in the package we want stuff like read_json(to_json) to give us back the original. then DevelopmentConstant(pattern).ldf_ should always give back the pattern. whether the fitted triangle is longer or shorter is irrelevant. this is evidenced by

raa = cl.load_sample('raa')
dev = cl.Development().fit(raa)
raa_1987 = raa[(raa.valuation <= '1988-01-01')]
print(dev.transform(raa).ldf_)
print(dev.transform(raa_1987).ldf_)

@kennethshsu
Collaborator Author

> > Sorry, I am not following your question/feedback. Can you give me an example of where you think my implementation doesn't give the correct answer, and what the correct answer should be?
>
> this implementation doesn't give the right answer when the supplied pattern is longer. the correct answer should replicate the supplied ldf or the implied ldf of the supplied cdf exactly (not counting factors of 1 to fill the space).

I don't think I agree with this. You are saying to just discard the extra LDF pattern beyond the triangle object? Look at this example:

import chainladder as cl

raa10x10 = cl.load_sample("raa")
raa9x9 = raa10x10[raa10x10.valuation<'1990']

reported_patterns = {
    12: 1.1,
    24: 1.1,
    36: 1.1,
    48: 1.1,
    60: 1.1,
    72: 1.1,
    84: 1.1,
    96: 1.1,
    108: 1.1, #9th LDF, or if this is a CDF, it would've been 1.21
    120: 1.1, #10th's LDF, not CDF
    # 132: 1.1,
}

raa10x10_ldf_ = cl.DevelopmentConstant(
    patterns=reported_patterns, style="ldf"
).fit_transform(raa10x10).ldf_
raa9x9_ldf_ = cl.DevelopmentConstant(
    patterns=reported_patterns, style="ldf"
).fit_transform(raa9x9).ldf_
print(raa10x10_ldf_)
print(raa9x9_ldf_)

Returns:

       12-24  24-36  36-48  48-60  60-72  72-84  84-96  96-108  108-120  120-132
(All)    1.1    1.1    1.1    1.1    1.1    1.1    1.1     1.1      1.1      1.1
       12-24  24-36  36-48  48-60  60-72  72-84  84-96  96-108  108-120
(All)    1.1    1.1    1.1    1.1    1.1    1.1    1.1     1.1     1.21
                                                                  # ^ this 9th's LDF should 
                                                                  # be 1.1 instead of 1.21

You are saying, we should just discard the remaining 1.1 ldf from 120-132 (120-ult) if the triangle data is shorter?
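
The arithmetic behind that question, as a small standalone check. The ages and factors mirror the example above; the variable names are just for illustration.

```python
from math import prod

# Supplied LDFs for ages 12 through 120, as in the example above.
supplied = {age: 1.1 for age in range(12, 121, 12)}

# Age-to-ultimate factor at 108 implied by the full supplied pattern.
supplied_cdf_108 = prod(f for age, f in supplied.items() if age >= 108)

# On a triangle whose last age is 108, the final fitted factor acts as
# the age-to-ultimate factor at 108.
folded_cdf_108 = supplied[108] * supplied[120]  # tail kept
dropped_cdf_108 = supplied[108]                 # tail discarded

print(supplied_cdf_108, folded_cdf_108, dropped_cdf_108)
```

Keeping the tail reproduces the supplied age-to-ultimate factor of about 1.21 at 108, while discarding it leaves 1.1.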

> basically, if elsewhere in the package we want stuff like read_json(to_json) to give us back the original. then DevelopmentConstant(pattern).ldf_ should always give back the pattern. whether the fitted triangle is longer or shorter is irrelevant. this is evidenced by
>
> raa = cl.load_sample('raa')
> dev = cl.Development().fit(raa)
> raa_1987 = raa[(raa.valuation <= '1988-01-01')]
> print(dev.transform(raa).ldf_)
> print(dev.transform(raa_1987).ldf_)

This is different. In this example, you are estimating patterns from a set of data and then estimating another pattern from a subset of that data, so it should be clear that the pattern estimated from the subset lacks something (i.e. the tail).

Let me try to see if I can convince you with another example:

Do you agree that these two patterns are the same?

DC_LDF = cl.DevelopmentConstant(
    patterns={
        12: 1.1,
        24: 1.1,
        36: 1.1,
        48: 1.1,
        60: 1.1,
        72: 1.1,
        84: 1.1,
        96: 1.1,
        108: 1.1,
        120: 1.1,
    },
    style="ldf",
)
DC_CDF = cl.DevelopmentConstant(
    patterns={
        12: 1.1**10,
        24: 1.1**9,
        36: 1.1**8,
        48: 1.1**7,
        60: 1.1**6,
        72: 1.1**5,
        84: 1.1**4,
        96: 1.1**3,
        108: 1.1**2,
        120: 1.1,
    },
    style="cdf",
)

If so, then you should get the same CDF, not LDF, no matter what triangle object you fit on.

Again, in my opinion, one flaw of the current implementation is that if the pattern provided is longer than the triangle and is in LDF form, the extra factors are basically discarded. My PR fixes that.

Development

Successfully merging this pull request may close these issues.

DevelopmentConstant drops the tail factor

2 participants